5 research outputs found

    Análisis y detección de odio en mensajes de Twitter

    [EN] Nowadays, the Web is a medium where users around the world interact with each other, carrying out important activities such as digital commerce, information search and decision making. As a result, sites such as social networks have captured the interest of both users and analysts. This phenomenon may be an advantage for the development of communications and the acquisition of information. However, some negative behaviours that may affect different groups of people have also been detected in this context. Hate speech is an example of such negative behaviour and is frequently published on popular social networks such as Twitter. It expresses hatred towards certain groups of people based on some specific aspect of their identity, such as their ethnicity, nationality or religion. It is generally characterized by its virality and by the anonymity of its authors. Specialists have identified that it incites hatred against the people targeted by the messages, and that on many occasions it can provoke violent actions against them. Because of the impact this can have on many people, various efforts to address it have begun. In particular, several evaluation tasks related to the detection of hate speech have been organized in recent years. In this work we analyze a set of these tasks focused on messages published on Twitter. We analyze the approaches proposed by the different teams in general, and our own proposals in particular. Based on a study of the different factors involved in the tasks, we perform a set of experiments that compare the strategies used with other ideas that we propose. As a result, we provide a summary of important aspects that can serve as a guide in the design of an approach to hate speech detection, or as a starting point for future studies.
    De La Peña Sarracén, GL. (2019). Análisis y detección de odio en mensajes de Twitter. http://hdl.handle.net/10251/129782 TFG

    Offensive keyword extraction based on the attention mechanism of BERT and the eigenvector centrality using a graph representation

    [EN] The proliferation of harmful content on social media affects a large part of the user community. Several approaches have therefore emerged to control this phenomenon automatically, but it remains a challenging task. In this paper, we explore offensive language as a particular case of harmful content and focus our study on the analysis of keywords in available datasets composed of offensive tweets. We aim to identify relevant words in those datasets and analyze how they can affect model learning. For keyword extraction, we propose an unsupervised hybrid approach that combines the multi-head self-attention of BERT with reasoning on a word graph. The attention mechanism captures relationships among words in a context while a language model is learned. These relationships are then used to generate a graph, from which we identify the most relevant words by using eigenvector centrality. Experiments were performed by means of two mechanisms. On the one hand, we used an information retrieval system to evaluate the impact of the keywords in recovering offensive tweets from a dataset. On the other hand, we evaluated a keyword-based model for offensive language detection. Results highlight some points to consider when training models with available datasets.
    Peña-Sarracén, GLDL.; Rosso, P. (2021). Offensive keyword extraction based on the attention mechanism of BERT and the eigenvector centrality using a graph representation. Personal and Ubiquitous Computing. 1-13. https://doi.org/10.1007/s00779-021-01605-5
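    The graph step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the token names and attention values below are made up, the symmetrization by averaging is an assumption, and in the paper the weights would come from BERT's multi-head self-attention rather than a hand-written matrix. The sketch only shows how eigenvector centrality ranks words once a weighted word graph exists.

```python
def eigenvector_centrality(weights, iterations=100, tol=1e-9):
    """Power iteration on a symmetric, non-negative weight matrix."""
    n = len(weights)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Each node's new score is the weighted sum of its neighbours' scores.
        new = [sum(weights[j][i] * scores[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in new) ** 0.5
        if norm == 0:
            return scores
        new = [v / norm for v in new]
        if max(abs(a - b) for a, b in zip(new, scores)) < tol:
            return new
        scores = new
    return scores


def rank_keywords(tokens, attention, top_k=3):
    """Build an undirected word graph from attention and rank its nodes."""
    n = len(tokens)
    # Assumption: average both attention directions and drop self-loops.
    graph = [
        [0.0 if i == j else (attention[i][j] + attention[j][i]) / 2 for j in range(n)]
        for i in range(n)
    ]
    scores = eigenvector_centrality(graph)
    ranked = sorted(zip(tokens, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Hypothetical tokens and attention weights (rows attend to columns).
tokens = ["w0", "w1", "w2", "w3"]
attention = [
    [0.1, 0.2, 0.1, 0.6],
    [0.2, 0.1, 0.1, 0.6],
    [0.1, 0.1, 0.2, 0.6],
    [0.3, 0.3, 0.3, 0.1],
]

top = rank_keywords(tokens, attention, top_k=2)
print(top)
```

    In this toy matrix every other token attends strongly to `w3`, so it receives the highest centrality. A real pipeline would first have to aggregate attention across heads and layers, which this sketch leaves out.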

    Profiling hate speech spreaders on twitter task at PAN 2021

    [EN] This overview presents the Author Profiling shared task at PAN 2021. The focus of this year's task is on determining whether or not the author of a Twitter feed is keen to spread hate speech. The main aim is to show the feasibility of automatically identifying potential hate speech spreaders on Twitter. For this purpose a corpus with Twitter data has been provided, covering the English and Spanish languages. Altogether, the approaches of 66 participants have been evaluated.
    First of all, we thank the participants: again 66 this year, as in the previous year's task on Profiling Fake News Spreaders! We also have to thank Martin Potthast, Matti Wiegmann, Nikolay Kolyada, and Magdalena Anna Wolska for their technical support with the TIRA platform. We thank Symanto for again sponsoring the award for the best performing system at the author profiling shared task. The work of Francisco Rangel was partially funded by the Centre for the Development of Industrial Technology (CDTI) of the Spanish Ministry of Science and Innovation under the research project IDI-20210776 on Proactive Profiling of Hate Speech Spreaders - PROHATER (Perfilador Proactivo de Difusores de Mensajes de Odio). The work of the researchers from Universitat Politècnica de València was partially funded by the Spanish MICINN under the project MISMIS-FAKEnHATE on MISinformation and MIScommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31), and by the Generalitat Valenciana under the project DeepPattern (PROMETEO/2019/121). This article is also based upon work from the Dig-ForAsp COST Action 17124 on Digital Forensics: evidence analysis via intelligent systems and practices, supported by European Cooperation in Science and Technology.
    Rangel, F.; Peña-Sarracén, GLDL.; Chulvi-Ferriols, MA.; Fersini, E.; Rosso, P. (2021). Profiling hate speech spreaders on twitter task at PAN 2021. CEUR. 1772-1789. http://hdl.handle.net/10251/190663

    Overview of PAN 2021: Authorship Verification, Profiling Hate Speech Spreaders on Twitter, and Style Change Detection.

    [EN] The paper gives a brief overview of the three shared tasks to be organized at the PAN 2021 lab on digital text forensics and stylometry hosted at the CLEF conference. The tasks include authorship verification across domains, author profiling for hate speech spreaders, and style change detection for multi-author documents. In part the tasks are new and in part they continue and advance past shared tasks, with the overall goal of advancing the state of the art and providing an objective evaluation on newly developed benchmark datasets.
    The work of the researchers from Universitat Politècnica de València was partially funded by the Spanish MICINN under the project MISMIS-FAKEnHATE on MISinformation and MIScommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31), and by the Generalitat Valenciana under the project DeepPattern (PROMETEO/2019/121).
    Bevendorff, J.; Chulvi-Ferriols, MA.; Peña-Sarracén, GLDL.; Kestemont, M.; Manjavacas, E.; Markov, I.; Mayerl, M.... (2021). Overview of PAN 2021: Authorship Verification, Profiling Hate Speech Spreaders on Twitter, and Style Change Detection. Springer. 567-573. https://doi.org/10.1007/978-3-030-72240-1_66

    Detección Multilingüe y Multimodal de Mensajes de Odio en Redes Sociales

    [EN] In this doctoral thesis we propose the design and development of technologies for automatic hate speech detection. The hypothesis on which the project is based is that hate detection can be improved by incorporating into text processing other sources of information, such as the images that are often shared together with these messages. In this way, we intend to develop strategies for automatic hate detection from a multimodal approach. Furthermore, the project will take into account the multilingual analysis of hate speech, using transfer learning strategies to handle languages with little available information. To carry out the research, we plan to build a dataset that allows multilingual and multimodal processing. In general, the work will focus on deep learning techniques when proposing approaches for hate speech detection.
    Peña-Sarracén, GLDL. (2021). Multilingual and Multimodal Hate Speech Detection in Social Media. CEUR Workshop Proceedings, vol. 2802. 23-30. http://hdl.handle.net/10251/191338